
    Identifying the Context Shift between Test Benchmarks and Production Data

    Machine learning models are often brittle on production data despite achieving high accuracy on benchmark datasets. Benchmark datasets have traditionally served dual purposes: first, benchmarks offer a standard on which machine learning researchers can compare different methods, and second, benchmarks provide a model, albeit an imperfect one, of the real world. The incompleteness of test benchmarks (and of the data upon which models are trained) hinders robustness in machine learning, enables shortcut learning, and leaves models systematically prone to err on out-of-distribution and adversarially perturbed data. The mismatch between a single static benchmark dataset and a production dataset has traditionally been described as dataset shift. In an effort to clarify how to address the mismatch between test benchmarks and production data, we introduce context shift to describe semantically meaningful changes in the underlying data generation process. Moreover, we identify three methods for addressing context shift that would otherwise lead to model prediction errors: first, we describe how human intuition and expert knowledge can identify semantically meaningful features upon which models systematically fail; second, we detail how dynamic benchmarking - with its focus on capturing the data generation process - can promote generalizability through corroboration; and third, we highlight that clarifying a model's limitations can reduce unexpected errors. Robust machine learning is focused on model performance beyond benchmarks, and as such, we consider three model organism domains - facial expression recognition, deepfake detection, and medical diagnosis - to highlight how implicit assumptions in benchmark tasks lead to errors in practice. By paying close attention to the role of context, researchers can design more comprehensive benchmarks, reduce context shift errors, and increase generalizability.
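    A minimal sketch of the first method (using semantically meaningful features to find where a model systematically fails) is shown below; the metadata field name and the data handling are illustrative assumptions, not code from the paper.

```python
# Sketch: compare benchmark accuracy across slices of a semantically meaningful
# metadata feature (e.g. lighting condition or skin type). Large gaps between
# slices flag a potential context shift before deployment.
from collections import defaultdict

def sliced_accuracy(predictions, labels, contexts):
    """Accuracy per context value; all three inputs are parallel sequences."""
    correct, total = defaultdict(int), defaultdict(int)
    for pred, label, ctx in zip(predictions, labels, contexts):
        correct[ctx] += int(pred == label)
        total[ctx] += 1
    return {ctx: correct[ctx] / total[ctx] for ctx in total}

# Hypothetical usage:
# sliced_accuracy(preds, y_true, metadata["lighting"])  ->  {"studio": 0.95, "outdoor": 0.71}
```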

    The antiobesity factor WDTC1 suppresses adipogenesis via the CRL4^WDTC1 E3 ligase

    WDTC1/Adp encodes an evolutionarily conserved suppressor of lipid accumulation. While reduced WDTC1 expression is associated with obesity in mice and humans, its cellular function is unknown. Here, we demonstrate that WDTC1 is a component of a DDB1-CUL4-ROC1 (CRL4) E3 ligase. Using the 3T3-L1 cell culture model of adipogenesis, we show that disrupting the interaction between WDTC1 and DDB1 leads to a loss of adipogenic suppression by WDTC1, increased triglyceride accumulation, and increased adipogenic gene expression. We show that the CRL4^WDTC1 complex promotes histone H2AK119 monoubiquitylation, thus suggesting a role for this complex in transcriptional repression during adipogenesis. Our results identify a biochemical role for WDTC1 and extend the functional range of the CRL4 complex to the suppression of fat accumulation.

    A massive nebula around the Luminous Blue Variable star RMC143 revealed by ALMA

    The luminous blue variable (LBV) RMC143 is located in the outskirts of the 30 Doradus complex, a region rich with interstellar material and hot luminous stars. We report the 3σ sub-millimetre detection of its circumstellar nebula with ALMA. The observed morphology in the sub-millimetre is different from that previously observed with HST and ATCA in the optical and centimetre wavelength regimes. The spectral energy distribution (SED) of RMC143 suggests that two emission mechanisms contribute to the sub-mm emission: optically thin bremsstrahlung and dust. Both the extinction map and the SED are consistent with a dusty massive nebula with a dust mass of 0.055 ± 0.018 M⊙ (assuming κ_850 = 1.7 cm^2 g^-1). To date, RMC143 has the dustiest LBV nebula observed in the Magellanic Clouds. We have also re-examined the LBV classification of RMC143 based on VLT/X-shooter spectra obtained in 2015/16 and a review of the publication record. The radiative transfer code CMFGEN is used to derive its fundamental stellar parameters. We find an effective temperature of ∌8500 K, a luminosity of log(L/L⊙) = 5.32, and a relatively high mass-loss rate of 1.0 × 10^-5 M⊙ yr^-1. The luminosity is much lower than previously thought, which implies that the current stellar mass of ∌8 M⊙ is comparable to its nebular mass of ∌5.5 M⊙ (from an assumed gas-to-dust ratio of 100), suggesting that the star has lost a large fraction of its initial mass in past LBV eruptions or binary interactions. While the star may have been hotter in the past, it is currently not hot enough to ionize its circumstellar nebula. We propose that the nebula is ionized externally by the hot stars in the 30 Doradus star-forming region. Comment: Paper accepted by A&A on 09/05/2019 and in proof stage. The referee's second comments are included in this version.
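    For reference, the quoted nebular gas mass follows directly from the measured dust mass and the assumed gas-to-dust ratio (our arithmetic check, not an additional result from the paper): M_gas ≈ (gas-to-dust ratio) × M_dust = 100 × 0.055 M⊙ ≈ 5.5 M⊙.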

    Improving dermatology classifiers across populations using images generated by large diffusion models

    Dermatological classification algorithms developed without sufficiently diverse training data may generalize poorly across populations. While intentional data collection and annotation offer the best means for improving representation, new computational approaches for generating training data may also aid in mitigating the effects of sampling bias. In this paper, we show that DALL·E 2, a large-scale text-to-image diffusion model, can produce photorealistic images of skin disease across skin types. Using the Fitzpatrick 17k dataset as a benchmark, we demonstrate that augmenting training data with DALL·E 2-generated synthetic images improves classification of skin disease overall and especially for underrepresented groups. Comment: NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research
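    As a rough sketch of how such augmentation could be wired into a standard training pipeline (the folder layout, paths, and use of torchvision are our assumptions, not the authors' released code):

```python
# Sketch: extend a real dermatology training set with synthetic images generated
# by a text-to-image model for underrepresented skin types, then train as usual.
import torch
from torchvision import datasets, transforms

tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

real = datasets.ImageFolder("fitzpatrick17k/train", transform=tf)         # real images (hypothetical path)
synthetic = datasets.ImageFolder("dalle2_synthetic/train", transform=tf)  # generated images (hypothetical path)

# Pool real and synthetic samples so minority skin types see more training examples.
train_set = torch.utils.data.ConcatDataset([real, synthetic])
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
```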

    Mathematical and computational models of drug transport in tumours

    The ability to predict how far a drug will penetrate into the tumour microenvironment within its pharmacokinetic (PK) lifespan would provide valuable information about therapeutic response. As the PK profile is directly related to the route and schedule of drug administration, an in silico tool that can predict the drug administration schedule that results in optimal drug delivery to tumours would streamline clinical trial design. This paper investigates the application of mathematical and computational modelling techniques to help improve our understanding of the fundamental mechanisms underlying drug delivery, and compares the performance of a simple model with more complex approaches. Three models of drug transport are developed, all based on the same drug binding model and parametrized by bespoke in vitro experiments. Their predictions, compared for a ‘tumour cord’ geometry, are qualitatively and quantitatively similar. We assess the effect of varying the PK profile of the supplied drug, and the binding affinity of the drug to tumour cells, on the concentration of drug reaching cells and the accumulated exposure of cells to drug at arbitrary distances from a supplying blood vessel. This is a contribution towards developing a useful drug transport modelling tool for informing strategies for the treatment of tumour cells that are ‘pharmacokinetically resistant’ to chemotherapy.
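    To make the modelling setting concrete, the sketch below shows the simplest kind of model in this family: 1D diffusion of free drug away from a supplying vessel with irreversible first-order binding to cells. The explicit numerical scheme and every parameter value are illustrative assumptions, not the paper's models or its bespoke in vitro parametrisation.

```python
# Sketch: free drug diffuses from a vessel into a tumour cord and binds to cells.
import numpy as np

D, k_on = 1e-10, 1e-3                      # diffusivity (m^2/s) and binding rate (1/s) -- assumed values
L, N, dt, steps = 2e-4, 100, 1e-2, 60_000  # 200-micron cord, grid points, time step (s), number of steps
dx = L / N
c = np.zeros(N)                            # free drug concentration (normalised to plasma level)
b = np.zeros(N)                            # bound (accumulated) drug concentration

for _ in range(steps):
    c[0] = 1.0                                          # vessel wall held at the plasma concentration
    lap = (np.roll(c, -1) - 2 * c + np.roll(c, 1)) / dx**2
    lap[-1] = (c[-2] - c[-1]) / dx**2                   # zero-flux condition at the outer edge of the cord
    bind_rate = k_on * c
    c = c + dt * (D * lap - bind_rate)                  # diffusion minus loss to binding
    b = b + dt * bind_rate                              # cells accumulate bound drug

# c[i] and b[i] now approximate free and bound drug at distance i * dx from the vessel.
```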

    SelfClean: A Self-Supervised Data Cleaning Strategy

    Most benchmark datasets for computer vision contain irrelevant images, near duplicates, and label errors. Consequently, model performance on these benchmarks may not be an accurate estimate of generalization capabilities. This is a particularly acute concern in computer vision for medicine, where datasets are typically small, stakes are high, and annotation processes are expensive and error-prone. In this paper, we propose SelfClean, a general procedure to clean up image datasets by exploiting a latent space learned with self-supervision. By relying on self-supervised learning, our approach focuses on intrinsic properties of the data and avoids annotation biases. We formulate dataset cleaning as either a set of ranking problems, which significantly reduce human annotation effort, or a set of scoring problems, which enable fully automated decisions based on score distributions. We demonstrate that SelfClean achieves state-of-the-art performance in detecting irrelevant images, near duplicates, and label errors within popular computer vision benchmarks, retrieving both injected synthetic noise and natural contamination. In addition, we apply our method to multiple image datasets and confirm an improvement in evaluation reliability.
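    A minimal sketch of one of the ranking problems mentioned (near-duplicate detection) is shown below; it assumes embeddings from any pretrained self-supervised encoder and is not the SelfClean API itself.

```python
# Sketch: rank image pairs by cosine similarity in a self-supervised latent space;
# the highest-ranked pairs are the most likely near duplicates for human confirmation.
import numpy as np

def rank_near_duplicates(embeddings: np.ndarray, top_k: int = 50):
    """embeddings: (n_images, dim) array; returns (i, j, similarity) tuples, most suspicious first."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    iu = np.triu_indices(len(z), k=1)          # consider each unordered pair once
    pair_sims = sim[iu]
    order = np.argsort(-pair_sims)[:top_k]
    return [(int(iu[0][i]), int(iu[1][i]), float(pair_sims[i])) for i in order]
```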

    Towards Reliable Dermatology Evaluation Benchmarks

    Benchmark datasets for digital dermatology unwittingly contain inaccuracies that reduce trust in model performance estimates. We propose a resource-efficient data cleaning protocol to identify issues that escaped previous curation. The protocol leverages an existing algorithmic cleaning strategy and is followed by a confirmation process terminated by an intuitive stopping criterion. Based on confirmation by multiple dermatologists, we remove irrelevant samples and near duplicates and estimate the percentage of label errors in six dermatology image datasets for model evaluation promoted by the International Skin Imaging Collaboration. Along with this paper, we publish revised file lists for each dataset, which should be used for model evaluation. Our work paves the way for more trustworthy performance assessment in digital dermatology. Comment: Link to the revised file lists: https://github.com/Digital-Dermatology/SelfClean-Revised-Benchmark
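    As an illustration of how a confirmation process with a stopping criterion could look in code (our assumption of the general shape, not the authors' exact protocol or criterion):

```python
# Sketch: confirm algorithmically ranked candidates in order and stop once the
# hit rate over a recent window of reviews falls below a threshold.
def confirm_until_depleted(ranked_candidates, is_issue, window=20, min_hit_rate=0.1):
    """ranked_candidates: most suspicious first; is_issue: manual reviewer decision (callable)."""
    confirmed, recent = [], []
    for candidate in ranked_candidates:
        hit = is_issue(candidate)          # e.g. a dermatologist's judgement in the paper's setting
        recent.append(hit)
        if hit:
            confirmed.append(candidate)
        if len(recent) >= window and sum(recent[-window:]) / window < min_hit_rate:
            break                          # few true issues remain further down the ranking
    return confirmed
```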

    Art and the science of generative AI: A deeper dive

    A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation. The generative capabilities of these tools are likely to fundamentally alter the creative processes by which creators formulate ideas and put them into production. As creativity is reimagined, so too may be many sectors of society. Understanding the impact of generative AI - and making policy decisions around it - requires new interdisciplinary scientific inquiry into culture, economics, law, algorithms, and the interaction of technology and creativity. We argue that generative AI is not the harbinger of art's demise, but rather a new medium with its own distinct affordances. In this vein, we consider the impacts of this new medium on creators across four themes: aesthetics and culture, legal questions of ownership and credit, the future of creative work, and impacts on the contemporary media ecosystem. Across these themes, we highlight key research questions and directions to inform policy and beneficial uses of the technology. Comment: This white paper is an expanded version of Epstein et al. (2023), published in Science Perspectives on July 16, 2023, which you can find at the following DOI: 10.1126/science.adh445
